(57) Abstract



A method of encoding a speech signal is disclosed. This method improves the excitation codebook and search procedure of
the conventional Code Excited Linear Prediction (CELP) speech encoders. Use is made of a dynamic codebook (201, 202) based
on a combination of two modules: a sparse algebraic code generator (201) associated with a filter (202) having a transfer function
varying in time. The generator (201) is a structured codebook with codewords having very few non-zero components. The filter
(202) shapes the spectral characteristics so that the resulting excitation codebook (201, 202) exhibits favorable perceptual
properties. The search complexity in finding the best codeword is greatly reduced by bringing the search back to the algebraic code
domain, thereby allowing the sparsity of the algebraic code to speed up the necessary computations.
BACKGROUND OF THE INVENTION

1. Field of the invention:

The present invention relates to a new
technique for digitally encoding and decoding signals,
in particular but not exclusively speech signals, with
a view to transmitting and synthesizing these signals.

2. Brief description of the prior art:



Efficient digital speech encoding techniques
with good subjective quality/bit rate tradeoffs are
increasingly in demand for numerous applications such
as voice transmission over satellite, land mobile,
digital radio or packet networks, voice storage,
voice response and secure telephony.

One of the best prior art methods capable of
achieving a good quality/bit rate tradeoff is the so
called Code Excited Linear Prediction (CELP)
technique. In accordance with this method, the speech
signal is sampled and converted into successive blocks
of a predetermined number of samples. Each block of
samples is synthesized by filtering an appropriate
innovation sequence from a codebook, scaled by a gain
factor, through two filters having transfer functions
varying in time. The first filter is a Long Term
Predictor filter (LTP) modeling the pseudoperiodicity
of speech, in particular due to pitch, while the
second one is a Short Term Predictor filter (STP)
modeling the spectral characteristics of the speech
signal. The encoding procedure used to determine the
parameters necessary to perform this synthesis is an
analysis by synthesis technique. At the encoder end,
the synthetic output is computed for all candidate
innovation sequences from the codebook. The retained
codeword is the one corresponding to the synthetic
output which is closest to the original speech signal
according to a perceptually weighted distortion
measure.

The first proposed structured codebooks are
called stochastic codebooks. They consist of an
actual set of stored sequences of N random samples.
More efficient stochastic codebooks derive
a codeword by removing one or more elements from
the beginning of the previous codeword and adding one
or more new elements at the end thereof. More
recently, stochastic codebooks based on linear
combinations of a small set of stored basis vectors
have greatly reduced the search complexity. Finally,
some algebraic structures have also been proposed as
excitation codebooks with efficient search procedures.
However, the latter are designed for speed and they
lack flexibility in constructing codebooks with good
subjective quality characteristics.



OBJECTS OF THE INVENTION



The main object of the present invention is to 
combine an algebraic codebook and a filter with a 
transfer function varying in time, to produce a 
dynamic codebook offering both the speed and memory 
saving advantages of the above discussed structured 
codebooks while reducing the computation complexity of 
the Code Excited Linear Prediction (CELP) technique
and enhancing the subjective quality of speech. 



SUMMARY OF THE INVENTION

More specifically, in accordance with the
present invention, there is provided a method of
producing an excitation signal that can be used in
synthesizing a sound signal, comprising the steps of
generating a codeword signal in response to an index
signal associated to this codeword signal, such signal
generating step using an algebraic code to generate
the codeword signal, and filtering the so generated
codeword signal to produce the excitation signal.
Advantageously, the algebraic code is a sparse
algebraic code.

The subject invention also relates to a dynamic
codebook for producing an excitation signal that can
be used in synthesizing a sound signal, comprising
means for generating a codeword signal in response to
an index signal associated to this codeword signal,
which signal generating means use an algebraic code
to generate the codeword signal, and means for
filtering the so generated codeword signal to produce
the excitation signal.

In accordance with a preferred embodiment of
the dynamic codebook, the filtering means comprises a
coloring filter having a transfer function varying in
time to shape the frequency characteristics of the
excitation signal so as to damp frequencies
perceptually annoying the human ear. This coloring
filter comprises an input supplied with linear
predictive coding parameters representative of
spectral characteristics of the sound signal to
vary the above mentioned transfer function.

In accordance with other aspects of the present 
invention, there is also provided: 

(1) a method of selecting one particular 
algebraic codeword that can be processed to produce a 
signal excitation for a synthesis means capable of 
synthesizing a sound signal, comprising the steps of 



BNSDOCID: <WO 9113432A1> 



WO 91/13432 PCT/CA90/00381



(a) whitening the sound signal to be synthesized to
generate a residual signal, (b) computing a target
signal X by processing a difference between the
residual signal and a long term prediction component
of the signal excitation, (c) backward filtering the
target signal to calculate a value D of this target
signal in the domain of an algebraic code, (d)
calculating, for each codeword among a plurality of
available algebraic codewords Ak expressed in the
algebraic code, a target ratio which is a function of
the value D, the codeword Ak, and a transfer function
H = D / X, and (e) selecting the said one particular
codeword among the plurality of available algebraic
codewords as a function of the calculated target ratios.






(2) an encoder for selecting one particular
algebraic codeword that can be processed to produce a
signal excitation for a synthesis means capable of
synthesizing a sound signal, comprising (a) means for
whitening the sound signal to be synthesized and
thereby generating a residual signal, (b) means for
computing a target signal X by processing a difference
between the residual signal and a long term prediction
component of the signal excitation, (c) means for
backward filtering the target signal to calculate a
value D of this target signal in the domain of an
algebraic code, (d) means for calculating, for each
codeword among a plurality of available algebraic
codewords Ak expressed in the above mentioned
algebraic code, a target ratio which is a function of
the value D, the codeword Ak, and a transfer function
H = D / X, and (e) means for selecting the said one
particular codeword among the plurality of available
algebraic codewords as a function of the calculated
target ratios. In accordance with preferred
embodiments of the encoder, the target ratio comprises
a numerator given by the expression $P^2(k) = (D A_k^T)^2$
and a denominator given by the expression $\alpha_k^2 = \|A_k H^T\|^2$,
where Ak and H are in matrix form, each
codeword Ak is a waveform comprising a small number of
non-zero impulses each of which can occupy different
positions in the waveform to thereby enable
composition of different codewords, the target ratio
calculating means comprises means for calculating, in
a plurality of embedded loops, contributions of the
non-zero impulses of the considered algebraic codeword
to the numerator and denominator and for adding the so
calculated contributions to previously calculated sum
values of this numerator and denominator,
respectively, the embedded loops comprise an inner
loop, and the codeword selecting means comprises means
for processing in the inner loop the calculated target
ratios to determine an optimized target ratio and
means for selecting the said one particular algebraic
codeword as a function of this optimized target ratio.






(3) a method of generating at least one long
term prediction parameter related to a sound signal in
view of encoding this sound signal, comprising the
steps of (a) whitening the sound signal to generate a
residual signal, (b) producing a long term prediction
component of a signal excitation for a synthesis means
capable of synthesizing the sound signal, which
producing step includes estimating an unknown portion
of the long term prediction component with the
residual signal, and (c) calculating the long term
prediction parameter as a function of the so produced
long term prediction component of the signal
excitation.

(4) a device for generating at least one long
term prediction parameter related to a sound signal in
view of encoding this sound signal, comprising (a)
means for whitening the sound signal and thereby
generating a residual signal, (b) means for producing
a long term prediction component of a signal
excitation for a synthesis means capable of
synthesizing the sound signal, these producing means
including means for estimating an unknown portion of
the long term prediction component with the residual
signal, and (c) means for calculating the long term
prediction parameter as a function of the so produced
long term prediction component of the signal
excitation.






The objects, advantages and other features of 
the present invention will become more apparent upon 
reading of the following, non restrictive description 
of a preferred embodiment thereof, given with 
reference to the accompanying drawings. 
In the appended drawings:

Figure 1 is a schematic block diagram of the
preferred embodiment of an encoding device in
accordance with the present invention;

Figure 2 is a schematic block diagram of a
decoding device using a dynamic codebook in accordance
with the present invention;






Figure 3 is a flow chart showing the sequence
of operations performed by the encoding device of
Figure 1;

Figure 4 is a flow chart showing the different 
operations carried out by a pitch extractor of the 
encoding device of Figure 1, for extracting pitch 
parameters including a delay T and a pitch gain b; and 

Figure 5 is a schematic representation of a 
plurality of embedded loops used in the computation of 
optimum codewords and code gains by an optimizing 
controller of the encoding device of Figure 1. 
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT



Figure 1 is the general block diagram of a
speech encoding device in accordance with the present
invention. Before being encoded by the device of
Figure 1, an analog input speech signal is filtered,
typically in the band 200 to 3400 Hz, and then sampled
at the Nyquist rate (e.g. 8 kHz). The resulting
signal comprises a train of samples of varying
amplitudes represented by 12 to 16 bits of a digital
code. The train of samples is divided into blocks
which are each L samples long. In the preferred
embodiment of the present invention, L is equal to 60.
Each block has therefore a duration of 7.5 ms. The
sampled speech signal is encoded on a block by block
basis by the encoding device of Figure 1 which is
broken down into 10 modules numbered from 102 to 111.
The sequence of operations performed by these modules
will be described in detail hereinafter with reference
to the flow chart of Figure 3 which presents numbered
steps. For easy reference, a step number in Figure 3
and the number of the corresponding module in Figure
1 have the same last two digits. Bold letters refer
to L-sample-long blocks (i.e. L-component vectors).
For instance, S stands for the block [S(1),
S(2),...S(L)].






Step 301: The next block S of L samples is supplied to
the encoding device of Figure 1.
Step 302: For each block of L samples of speech
signal, a set of Linear Predictive Coding (LPC)
parameters, called STP parameters, is produced in
accordance with a prior art technique through an LPC
spectrum analyser 102. More specifically, the latter
analyser 102 models the spectral characteristics of
each block S of samples. In the preferred embodiment,
the parameters STP comprise a number M=10 of
prediction coefficients [a1, a2,...aM]. One can refer
to the book by J.D. Markel & A.H. Gray, Jr.: "Linear
Prediction of Speech", Springer Verlag (1976) to obtain
information on representative methods of generating
these parameters.

Step 303: The input block S is whitened by a whitening
filter 103 having the following transfer function
based on the current values of the STP prediction
parameters:

$$A(z) = \sum_{i=0}^{M} a_i z^{-i} \tag{1}$$

where $a_0 = 1$, and z represents the variable of the
polynomial A(z).

As illustrated in Figure 1, the filter 103
produces a residual signal R.
Of course, as the processing is performed on a
block basis, unless otherwise stated, all the filters
are assumed to store their final state for use as
initial state in the following block processing.
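
The whitening of step 303 together with the state-carrying convention just described can be sketched as follows. This is only an illustrative sketch, not the implementation of filter 103: the function name `whiten_block` and the choice of keeping the last M input samples as the filter state are assumptions made for the example.

```python
def whiten_block(block, a, state):
    """Filter one block through A(z) = sum_{i=0}^{M} a[i] z^-i, with a[0] == 1.

    `state` holds the last M input samples of the previous block (oldest
    first). The residual block and the updated state are returned, so that
    the final state of one block is the initial state of the next one.
    """
    m = len(a) - 1
    history = list(state) + list(block)      # previous M inputs, then this block
    residual = []
    for n in range(len(block)):
        # history[m + n] is the current sample; history[m + n - i] is i samples older
        residual.append(sum(a[i] * history[m + n - i] for i in range(m + 1)))
    new_state = history[len(history) - m:] if m else []
    return residual, new_state
```

Filtering a signal block by block with the returned state gives the same residual as filtering it in one pass, which is the property the text relies on.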

The purpose of step 304 is to compute the 
speech periodicity characterized by the Long Term 
Prediction (LTP) parameters including a delay T and a 
pitch gain b. 

Before further describing step 304, it is 
useful to explain the structure of the speech decoding 
device of Figure 2 and understand the principle upon 
which speech is synthesized. 

As shown in Figure 2, a demultiplexer 205
interprets the binary information received from a
digital input channel into four types of parameters,
namely the parameters STP, LTP, k and g. The current
block Ŝ of speech signal is synthesized on the basis
of these four parameters as will be seen hereinafter.



The decoding device of Figure 2 follows the
classical structure of the CELP (Code Excited Linear
Prediction) technique insofar as modules 201 and 202
are considered as a single entity: the (dynamic)
codebook. The codebook is a virtual (i.e. not
actually stored) collection of L-sample-long waveforms
(codewords) indexed by an integer k. The index k
ranges from 0 to NC-1 where NC is the size of the
codebook. This size is 4096 in the preferred
embodiment. In the CELP technique, the output speech
signal is obtained by first scaling the k-th entry of
the codebook by the code gain g through an amplifier
206. An adder 207 adds the so obtained scaled
waveform, gCk, to the output E (the long term
prediction component of the signal excitation of a
synthesis filter 204) of a long term predictor 203
placed in a feedback loop and having a transfer
function B(z) defined as follows:

$$B(z) = b z^{-T} \tag{2}$$

where b and T are the above defined pitch gain and
delay, respectively.

The predictor 203 is a filter having a transfer
function influenced by the last received LTP
parameters b and T to model the pitch periodicity of
speech. It introduces the appropriate pitch gain b
and delay of T samples. The composite signal gCk + E
constitutes the signal excitation of the synthesis
filter 204 which has a transfer function 1/A(z). The
filter 204 provides the correct spectrum shaping in
accordance with the last received STP parameters.
More specifically, the filter 204 models the resonant
frequencies (formants) of speech. The output block Ŝ
is the synthesized (sampled) speech signal which can
be converted into an analog signal with proper
anti-aliasing filtering in accordance with a technique well
known in the art.
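
The decoder signal path described above (amplifier 206, adder 207, long term predictor 203 in a feedback loop, synthesis filter 204) can be illustrated with the short sketch below. It is a simplified illustration under stated assumptions: the function name `synthesize_block`, the use of Python lists for filter memories, and the sample-by-sample recursion are choices made for the example and are not taken from the specification.

```python
def synthesize_block(c, g, b, T, a, exc_hist, out_hist):
    """Decode one block from a codeword and the received parameters.

    c        : codeword samples Ck(n) for the block
    g, b, T  : code gain, pitch gain and pitch delay (in samples)
    a        : STP coefficients [1, a1, ..., aM] of A(z)
    exc_hist : past excitation samples, oldest first (at least T of them)
    out_hist : past output samples, oldest first (at least M of them)
    Returns the new output block plus the updated excitation and output memories.
    """
    m = len(a) - 1
    exc = list(exc_hist)
    out = list(out_hist)
    for n in range(len(c)):
        # Adder 207: scaled codeword plus long term prediction b * e(n - T)
        e = g * c[n] + b * exc[-T]
        exc.append(e)
        # Synthesis filter 1/A(z): s(n) = e(n) - sum_{i=1}^{M} a_i s(n - i)
        s = e - sum(a[i] * out[-i] for i in range(1, m + 1))
        out.append(s)
    return out[len(out_hist):], exc, out
```

Because the recursion runs one sample at a time, delays T shorter than the block length need no special handling in this sketch.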

In the present invention, the codebook is
dynamic; it is not stored but is generated by the two
modules 201 and 202. In a first step, an algebraic
code generator 201 produces, in response to the index
k and in accordance with a Sparse Algebraic Code (SAC),
a codeword Ak formed of an L-sample-long waveform
having very few non-zero components. In fact, the
generator 201 constitutes an inner, structured
codebook of size NC. In a second step, the codeword
Ak from the generator 201 is processed by a coloring
filter 202 whose transfer function F(z) varies in time
in accordance with the STP parameters. The filter 202
colors, i.e. shapes the frequency characteristics
(dynamically controls the frequency) of the output
excitation signal Ck so as to damp a priori those
frequencies perceptually more annoying to the human
ear. The excitation signal Ck, sometimes called the
innovation sequence, takes care of whatever part of
the original speech signal is left unaccounted for by
the above defined formant and pitch modelling. In the
preferred embodiment of the present invention, the
transfer function F(z) is given by the following
relationship:

$$F(z) = \frac{A(z\gamma_1^{-1})}{A(z\gamma_2^{-1})} \tag{3}$$

where $\gamma_1 = 0.7$ and $\gamma_2 = 0.85$.
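
As a hedged illustration of how such a coloring filter could be derived from the STP parameters, the sketch below assumes that F(z) has the form A(z/γ1)/A(z/γ2) with γ1 = 0.7 and γ2 = 0.85. Since A(z/γ) = Σ a_i γ^i z^-i, each coefficient is simply scaled by a power of γ. The function names are hypothetical.

```python
def expand(a, gamma):
    """Coefficients of A(z / gamma): the i-th coefficient is scaled by gamma**i."""
    return [coef * gamma ** i for i, coef in enumerate(a)]


def coloring_filter_coeffs(a, gamma1=0.7, gamma2=0.85):
    """Numerator and denominator coefficients of F(z) = A(z/gamma1) / A(z/gamma2).

    `a` is [1, a1, ..., aM]; the default gamma values are the assumed
    constants of the preferred embodiment.
    """
    return expand(a, gamma1), expand(a, gamma2)
```

The same `expand` helper would also give the denominator of a perceptual weighting filter of the form A(z)/A(z/γ).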
There are many ways to design the generator
201. An advantageous method consists of interleaving
four single-pulse permutation codes as follows. The
codewords Ak are composed of four non-zero pulses with
fixed amplitudes, namely $S_1 = 1$, $S_2 = -1$, $S_3 = 1$, and $S_4 = -1$.
The positions allowed for $S_i$ are of the form $p_i = 2i + 8m_i - 1$,
where $m_i = 0, 1, 2, \ldots 7$. It should be noted that for
$m_3 = 7$ (or $m_4 = 7$) the position $p_3$ (or $p_4$) falls beyond
L=60. In such a case, the impulse is simply
discarded. The index k is obtained in a
straightforward manner using the following
relationship:

$$k = 512\,m_1 + 64\,m_2 + 8\,m_3 + m_4 \tag{4}$$

The resulting Ak-codebook is accordingly
composed of 4096 waveforms having only 2 to 4 non-zero
impulses.
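
The interleaved single-pulse construction can be sketched in a few lines. The sketch assumes the amplitudes 1, -1, 1, -1, the position rule p_i = 2i + 8m_i - 1 and the index rule k = 512 m1 + 64 m2 + 8 m3 + m4; the function names `index_to_m` and `codeword` are illustrative only.

```python
L = 60                       # block length of the preferred embodiment
AMPLITUDES = (1, -1, 1, -1)  # fixed pulse amplitudes S1..S4


def index_to_m(k):
    """Split k = 512*m1 + 64*m2 + 8*m3 + m4 into (m1, m2, m3, m4)."""
    return (k >> 9) & 7, (k >> 6) & 7, (k >> 3) & 7, k & 7


def codeword(k):
    """Return the L-sample sparse codeword Ak as a list (list index 0 is sample 1)."""
    a = [0] * L
    for i, m in enumerate(index_to_m(k), start=1):
        p = 2 * i + 8 * m - 1            # 1-based pulse position p_i
        if p <= L:                       # a pulse falling beyond the block is discarded
            a[p - 1] = AMPLITUDES[i - 1]
    return a
```

For example, `codeword(0)` places pulses at samples 1, 3, 5 and 7, while `codeword(4095)` keeps only the two pulses at samples 57 and 59 because the other two fall beyond the block.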



Returning to the encoding procedure, it is
useful to discuss briefly the criterion used to select
the best excitation signal Ck. This signal must be
chosen to minimize, in some way, the difference S - Ŝ
between the original and synthesized speech
signals. In the original CELP formulation, the excitation
signal Ck is selected on the basis of a Mean Squared
Error (MSE) criterion applied to the error Δ = S' - Ŝ',
where S', respectively Ŝ', is S, respectively Ŝ,
processed by a perceptual weighting filter of the form
$A(z)/A(z\gamma^{-1})$ where $\gamma = 0.8$ is the perceptual
constant. In the present invention, the same criterion
is used but the computations are performed in accordance
with a backward filtering procedure which is now briefly
recalled. One can refer to the article by J.P. Adoul,
P. Mabilleau, M. Delprat, & S. Morissette: "Fast CELP
coding based on algebraic codes", Proc. IEEE Int'l
Conference on Acoustics, Speech and Signal Processing,
pp 1957-1960 (April 1987), for more details on this
procedure. Backward filtering brings the search back
to the Ck-space. The present invention brings the
search further back to the Ak-space. This improvement,
together with the very efficient search method used by
controller 109 (Figure 1) and discussed hereinafter,
enables a tremendous reduction in computation
complexity with regard to the conventional approaches.

It should be noted here that the combined
transfer function of the filters 103 and 107 (Figure
1) is precisely the same as that of the above
mentioned perceptual weighting filter which transforms
S into S', that is transforms S into the domain where
the MSE criterion can be applied.

Step 304: To carry out this step, a pitch extractor
104 (Figure 1) is used to compute and quantize the LTP
parameters, namely the pitch delay T ranging from
Tmin to Tmax (20 to 146 samples in the preferred
embodiment) and the pitch gain b. Step 304 itself
comprises a plurality of steps as illustrated in
Figure 4. Referring now to Figure 4, a target signal
Y is calculated by filtering (step 402) the residual
signal R through the perceptual filter 107 with its
initial state set (step 401) to the value FS available
from an initial state extractor 110. The initial
state of the extractor 104 is also set to the value FS
as illustrated in Figure 1. The long term prediction
component of the signal excitation, E(n), is not known
for the current values n = 1, 2, ... The values E(n)
for n = 1 to L-Tmin+1 are accordingly estimated using
the residual signal R available from the filter 103
(step 403). More specifically, E(n) is made equal to
R(n) for these values of n. In order to start the
search for the best pitch delay T, two variables Max
and τ are initialized to 0 and Tmin respectively (step
404). With the initial state set to zero (step 405),
the long term prediction part of the signal excitation
shifted by the value τ, E(n-τ), is processed by the
perceptual filter 107 to obtain the signal Z. The
crosscorrelation ρ between the signals Y and Z is then
computed using the expression in block 406 of Figure
4. If the crosscorrelation ρ is greater than the
variable Max (step 407), the pitch delay T is updated
to τ, the variable Max is updated to the value of the
crosscorrelation ρ and the pitch energy term αp equal
to ||Z|| is stored (step 410). If τ is smaller than
Tmax (step 411), it is incremented by one (step 409)
and the search procedure continues. When τ reaches
Tmax, the optimum pitch gain b is computed and
quantized using the expression b = Max/αp (step 412).
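
The delay search of Figure 4 can be sketched as below. Several points are assumptions, since the expression of block 406 is given only in the drawing: the crosscorrelation is taken here as the normalized form ρ = ⟨Y, Z⟩ / ||Z||, so that b = Max/αp reduces to the usual least-squares gain; the perceptual filter is passed in as a hypothetical callable `weight` applied with zero initial state; and quantization of T and b is omitted.

```python
import math


def pitch_search(y, r, past_exc, weight, t_min=20, t_max=146):
    """Return (T, b) maximizing the assumed normalized crosscorrelation.

    y        : target block Y
    r        : residual block R, used to estimate the unknown part of E(n)
    past_exc : past excitation samples, oldest first (at least t_max of them)
    weight   : callable applying the perceptual filter with zero initial state
    """
    block_len = len(y)
    ext = list(past_exc) + list(r)       # step 403: E(n) = R(n) where unknown
    base = len(past_exc)
    best_rho, best_t, best_alpha = 0.0, t_min, 1.0   # step 404: Max = 0, tau = Tmin
    for tau in range(t_min, t_max + 1):
        shifted = [ext[base + n - tau] for n in range(block_len)]   # E(n - tau)
        z = weight(shifted)
        alpha = math.sqrt(sum(v * v for v in z))     # pitch energy term ||Z||
        if alpha == 0.0:
            continue
        rho = sum(yn * zn for yn, zn in zip(y, z)) / alpha
        if rho > best_rho:                           # steps 407 and 410
            best_rho, best_t, best_alpha = rho, tau, alpha
    return best_t, best_rho / best_alpha             # step 412: b = Max / alpha_p
```

With a perfectly periodic excitation and an identity `weight`, the search returns the period as T and a gain close to 1.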



Step 305: In step 305, a filter responses
characterizer 105 (Figure 1) is supplied with the STP
and LTP parameters to compute a filter responses
characterization FRC for use in the later steps. The
FRC information consists of the following three
components where n = 1, 2, ... L. It should also be
noted that the component f(n) includes the long term
prediction loop.

• f(n): impulse response of

$$F(z)\,\frac{1}{1 - b z^{-T}} \tag{5a}$$

• h(n): response of

$$\frac{1}{A(z\gamma^{-1})} \tag{5b}$$

to f(n), with zero initial state.

• U(i,j): autocorrelation of h(n); i.e.:

$$U(i,j) = \sum_{k=1}^{L} h(k-i+1)\,h(k-j+1), \quad 1 \le i \le L \text{ and } i \le j \le L; \quad h(n) = 0 \text{ for } n < 1 \tag{5c}$$
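
The autocorrelation component of the FRC information can be computed directly from h(n). The sketch below assumes the definition U(i,j) = Σ_{k=1}^{L} h(k-i+1) h(k-j+1) with h(n) = 0 for n < 1. The function name is illustrative, and filling the lower triangle by symmetry is a convenience added for the example (the definition only covers i ≤ j).

```python
def autocorrelation_matrix(h):
    """Return U as a list of lists, with U[i-1][j-1] = U(i, j) for 1-based i, j."""
    size = len(h)

    def h1(n):
        # 1-based access to h(n), zero outside 1..L
        return h[n - 1] if 1 <= n <= size else 0.0

    U = [[0.0] * size for _ in range(size)]
    for i in range(1, size + 1):
        for j in range(i, size + 1):
            # terms with k < j vanish because h(k - j + 1) = 0 there
            value = sum(h1(k - i + 1) * h1(k - j + 1) for k in range(j, size + 1))
            U[i - 1][j - 1] = value
            U[j - 1][i - 1] = value      # symmetric fill (assumption for convenience)
    return U
```

In a real-time encoder the entries would more likely be obtained recursively along each diagonal; the direct sum is used here only for clarity.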
The utility of the FRC information will become
obvious upon discussion of the forthcoming steps.

Step 306: The long term predictor 106 is supplied with
the signal excitation E + gCk to compute the component
E of this excitation contributed by the long term
prediction (parameters LTP) using the proper pitch
delay T and gain b. The predictor 106 has the same
transfer function as the long term predictor 203 of
Figure 2.

Step 307: In this step, the initial state of the
perceptual filter 107 is set to the value FS supplied
by the initial state extractor 110. The difference R-E
calculated by a subtracter 121 (Figure 1) is then
supplied to the perceptual filter 107 to obtain at the
output of the latter filter a target block signal X.
As illustrated in Figure 1, the STP parameters are
applied to the filter 107 to vary its transfer
function in relation to these parameters. Basically,
X = S' - P where P represents the contribution of the
long term prediction (LTP) including "ringing" from
the past excitations. The MSE criterion which applies
to Δ can now be stated in the following matrix
notation:

$$\min_k \|\Delta\|^2 = \min_k \|S' - \hat{S}'\|^2 = \min_k \|S' - [P + g A_k H^T]\|^2 = \min_k \|X - g A_k H^T\|^2 \tag{6}$$
where H accounts for the global filter transfer
function $F(z)/\big((1-B(z))\,A(z\gamma^{-1})\big)$. It is an L x L lower
triangular Toeplitz matrix formed from the h(n)
response.



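Step 308: This is the backward filtering step
performed by the filter 108 of Figure 1. Setting to
zero the derivative of the above equation (6) with
respect to the code gain g yields the optimum gain
as follows:

$$g = \frac{X (A_k H^T)^T}{\|A_k H^T\|^2} \tag{7}$$

With this value for g the minimization becomes the
following maximization:

$$\max_k \frac{\big(X (A_k H^T)^T\big)^2}{\|A_k H^T\|^2} = \max_k \frac{\big((XH) A_k^T\big)^2}{\alpha_k^2} = \max_k \frac{(D A_k^T)^2}{\alpha_k^2} \tag{8}$$

where $D = (XH)$ and $\alpha_k^2 = \|A_k H^T\|^2$.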
In step 308, the backward filtered target
signal D=(XH) is computed. The term "backward
filtering" for this operation comes from the
interpretation of (XH) as the filtering of
time-reversed X.
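
The time-reversal interpretation can be checked numerically with a small sketch. It assumes that H is the lower triangular Toeplitz matrix with entries H(n,j) = h(n-j+1), so that A H^T is the codeword filtered through h(n) with zero initial state and D = XH is a correlation of X with h(n). Function names are illustrative and h is assumed to hold at least L samples.

```python
def forward_filter(a, h):
    """A H^T: the row vector a filtered through h with zero initial state."""
    return [sum(h[i] * a[n - i] for i in range(n + 1)) for n in range(len(a))]


def backward_filter(x, h):
    """D = X H, computed as: reverse X in time, filter through h, reverse back."""
    return forward_filter(x[::-1], h)[::-1]
```

The useful identity is X (A H^T)^T = D A^T: once D is known, the correlation of the target with any filtered codeword costs only as many operations as the codeword has non-zero pulses.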

Step 309: In this step performed by the optimizing
controller 109 of Figure 1, equation (8) is optimized
by computing the ratio $(D A_k^T/\alpha_k)^2 = P_k^2/\alpha_k^2$ for each
sparse algebraic codeword Ak. The denominator is given
by the expression:

$$\alpha_k^2 = \|A_k H^T\|^2 = A_k H^T H A_k^T = A_k U A_k^T \tag{9}$$

where U is the Toeplitz matrix of the autocorrelations
defined in equation (5c). Calling S(i) and p(i)
respectively the amplitude and position of the i-th
non-zero impulse (i = 1, 2, ...N), the numerator and
(squared) denominator simplify to the following:

$$P(N) = \sum_{i=1}^{N} S(i)\,D(p_i) \tag{10a}$$

$$\alpha^2(N) = \sum_{i=1}^{N} S^2(i)\,U(p_i,p_i) + 2\sum_{i=1}^{N-1}\sum_{j=i+1}^{N} S(i)S(j)\,U(p_i,p_j) \tag{10b}$$

where $P(N) = D A_k^T$.
A very fast procedure for calculating the above
defined ratio for each codeword Ak is described in
Figure 5 as a set of N embedded computation loops, N
being the number of non-zero impulses in the
codewords. The quantities $S^2(i)$ and $SS(i,j) = S(i)S(j)$,
for i = 1, 2, ... N and i < j ≤ N are
prestored for maximum speed. Prior to the computations,
the values for $P^2_{opt}$ and $\alpha^2_{opt}$ are initialized to
zero and some large number, respectively. As can be
seen in Figure 5, partial sums of the numerator and
denominator are calculated in each one of the outer
and inner loops, while in the inner loop the largest
ratio $P^2(N)/\alpha^2(N)$ is retained as the ratio $P^2_{opt}/\alpha^2_{opt}$.
The calculating procedure is believed to be otherwise
self-explanatory from Figure 5. When the N embedded
loops are completed, the code gain is computed as
$g = P_{opt}/\alpha^2_{opt}$ (cf. equation (7)). The gain is then
quantized, the index k is computed from stored impulse
positions using the expression (4), and the L
components of the scaled optimum code gCk are computed
as follows:


$$g\,C_k(n) = g \sum_{i=1}^{N} S(i)\, f(n - p_i + 1), \quad 1 \le n \le L \tag{11}$$

with f(n) = 0 for n < 1.
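
The embedded-loop search can be sketched as follows for the four-pulse code of the preferred embodiment. This is an illustration under assumptions, not the procedure of Figure 5 itself: the loops are written as a recursion of depth four, the partial sums follow P = Σ S(i) D(p_i) and α² = Σ S²(i) U(p_i,p_i) + 2 Σ S(i)S(j) U(p_i,p_j), ratios are compared by cross-multiplication to avoid divisions, and the names `search`, `AMPL` and `L` are hypothetical.

```python
L = 60                  # block length
AMPL = (1, -1, 1, -1)   # pulse amplitudes S(1)..S(4)


def search(D, U):
    """Return (k, P_opt, alpha2_opt) maximizing P^2 / alpha^2 over the 4096 codewords.

    D is the backward filtered target (length L) and U the L x L
    autocorrelation matrix, both indexed from 0 in this sketch.
    """
    best = {"P": 0.0, "a2": 1e30, "m": (0, 0, 0, 0)}   # P_opt = 0, alpha2_opt large

    def loop(i, ms, P, a2, placed):
        if i == 5:
            # innermost level: keep the codeword with the largest ratio
            if P * P * best["a2"] > best["P"] ** 2 * a2:
                best.update(P=P, a2=a2, m=tuple(ms))
            return
        s = AMPL[i - 1]
        for m in range(8):
            p = 2 * i + 8 * m - 1                       # 1-based position p_i
            if p > L:                                   # discarded pulse: no contribution
                loop(i + 1, ms + [m], P, a2, placed)
                continue
            q = p - 1
            cross = sum(sj * U[qj][q] for sj, qj in placed)
            loop(i + 1, ms + [m],
                 P + s * D[q],                          # partial numerator
                 a2 + s * s * U[q][q] + 2 * s * cross,  # partial denominator
                 placed + [(s, q)])

    loop(1, [], 0.0, 0.0, [])
    m1, m2, m3, m4 = best["m"]
    return 512 * m1 + 64 * m2 + 8 * m3 + m4, best["P"], best["a2"]
```

The code gain then follows as `P_opt / alpha2_opt`. Each codeword costs only a handful of additions because the partial sums of the outer levels are reused by the inner ones.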



Step 310: The global signal excitation E + gCk
is computed by an adder 120 (Figure 1). The initial
state extractor module 110, constituted by a
perceptual filter with a transfer function $1/A(z\gamma^{-1})$
varying in relation to the STP parameters, subtracts
from the residual signal R the signal excitation
E + gCk for the sole purpose of obtaining the
final filter state FS for use as initial state in
filter 107 and module 104.

Step 311: The set of four parameters STP, LTP, k and
g is converted into the proper digital channel format
by a multiplexer 111, completing the procedure for
encoding a block S of samples of speech signal.

Accordingly, the present invention provides a
fully quantized Algebraic Code Excited Linear
Prediction (ACELP) vocoder giving near toll quality at
rates ranging from 4 to 16 kbit/s. This is achieved
through the use of the above described dynamic
codebook and associated fast search algorithm.

The drastic complexity reduction that the
present invention offers when compared to the prior
art techniques comes from the fact that the search
procedure can be brought back to Ak-code space by a
modification of the so called backward filtering
formulation. In this approach the search reduces to
finding the index k for which the ratio $|D A_k^T|/\alpha_k$ is
the largest. In this ratio, D is a fixed target
signal and $\alpha_k$ is an energy term which can be
computed with very few operations per codeword
when N, the number of non-zero components of the
codeword Ak, is small.
Although a preferred embodiment of the present
invention has been described in detail hereinabove,
this embodiment can be modified at will, within the
scope of the appended claims, without departing from
the nature and spirit of the invention. As an
example, many types of algebraic codes can be chosen
to achieve the same goal of reducing the search
complexity while many types of coloring filters can be
used. Also the invention is not limited to the
treatment of a speech signal; other types of sound
signal can be processed. Such modifications, which
retain the basic principle of combining an algebraic
code generator with a coloring filter, are obviously
within the scope of the subject invention.
The embodiments of the invention in which
an exclusive property or privilege is claimed are 
defined as follows. 



1. A method of producing an excitation 
signal that can be used in synthesizing a sound 
signal, comprising the steps of: 

generating a codeword signal in 
response to an index signal associated to said 
codeword signal, said signal generating step using an 
algebraic code to generate the said codeword signal; 
and 

filtering the so generated codeword 
signal to produce said excitation signal. 



2. A method as defined in claim 1, in
which the algebraic code is a sparse algebraic code.



3. A method as defined in claim 1, 
wherein the excitation signal has frequency 
characteristics, and wherein said filtering step 
comprises processing the codeword signal through a 
coloring filter having a transfer function varying in 
time to thereby shape the frequency characteristics of 
the excitation signal so as to damp frequencies 
perceptually annoying the human ear. 
4. A method as defined in claim 3, in
which the transfer function of the coloring filter is 
varied in relation to linear predictive coding 
parameters representative of spectral characteristics 
of the said sound signal. 

5. A dynamic codebook for producing an
excitation signal that can be used in synthesizing a 
sound signal, comprising: 

means for generating a codeword signal 
in response to an index signal associated to said 
codeword signal, said signal generating means using an 
algebraic code to generate the said codeword signal; 
and 

means for filtering the so generated 
codeword signal to produce said excitation signal. 

6. A codebook as defined in claim 5, in 
which the algebraic code is a sparse algebraic code.

7. A codebook as defined in claim 5, 
wherein the excitation signal has frequency 
characteristics, and wherein said filtering means 
comprises a coloring filter having a transfer function 
varying in time to shape the frequency characteristics 
of the excitation signal so as to damp frequencies 
perceptually annoying the human ear.
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8. A codebook as defined in claim 7, in 
which the coloring filter comprises an input supplied 
with linear predictive coding parameters 
representative of spectral characteristics of the said 
sound signal to vary the said transfer function. 



9. A method of selecting one particular 
algebraic codeword that can be processed to produce a 
signal excitation for a synthesis means capable of 
synthesizing a sound signal, comprising the steps of: 

whitening said sound signal to be 
synthesized to generate a residual signal; 

computing a target signal X by 
processing a difference between the said residual 
signal and a long term prediction component of said 
signal excitation; 

backward filtering the target signal 
to calculate a value D of the said target signal in 
the domain of an algebraic code; 

calculating, for each codeword among 
a plurality of available algebraic codewords Ak 
expressed in the said algebraic code, a target ratio 
which is function of the value D, the codeword Ak, and 
a transfer function H = D / X; and 

selecting the said one particular 
codeword among said plurality of available algebraic 
codewords in function of the calculated target ratios. 



10. The selecting method of claim 9, in 
which said target ratio comprises a numerator given by 
the expression $P^2(k) = (D A_k^T)^2$ and a denominator 
given by the expression $\alpha_k^2 = \| A_k H^T \|^2$, where Ak and 
H are in matrix form. 
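
As a rough sketch of how the ratio of claims 9 and 10 could be evaluated, assume that X, D and Ak are row vectors of the same length N, and that H is the lower triangular Toeplitz matrix built from a filter impulse response `h`, so that H[n][m] = h[n-m] for n >= m. That matrix layout and the function names below are assumptions for illustration, not taken from the claims. Under this layout D = X H reduces to a backward (time-reversed) filtering of the target.

```python
def backward_filter(x, h):
    """D = X H, i.e. D[i] = sum over j >= i of x[j] * h[j - i]."""
    n = len(x)
    return [sum(x[j] * h[j - i] for j in range(i, n)) for i in range(n)]


def filter_codeword(a, h):
    """A H^T, i.e. ordinary zero-state filtering of the codeword by h."""
    n = len(a)
    return [sum(a[m] * h[i - m] for m in range(i + 1)) for i in range(n)]


def target_ratio(d, a, h):
    """Numerator (D A^T)^2 over denominator ||A H^T||^2."""
    p = sum(di * ai for di, ai in zip(d, a))
    filtered = filter_codeword(a, h)
    alpha2 = sum(v * v for v in filtered)
    return (p * p) / alpha2 if alpha2 > 0.0 else 0.0


x = [1.0, 2.0, 3.0, 4.0]          # target signal X
h = [1.0, 0.5, 0.25, 0.125]       # assumed impulse response, length N
a = [0.0, 1.0, 0.0, -1.0]         # sparse codeword Ak with two pulses
d = backward_filter(x, h)
ratio = target_ratio(d, a, h)
```

The point of computing D once per block is that the numerator then costs only one multiply-add per non-zero pulse of Ak, instead of a full filtering of every candidate codeword.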



11. The selecting method of claim 10, 
wherein each codeword Ak is a waveform comprising a 
small number of non-zero impulses each of which can 
occupy different positions in the waveform to thereby 
enable composition of different codewords. 

12. The selecting method of claim 11, in 
which said target ratio calculating step uses a 
calculating procedure including embedded loops in 
which are calculated contributions of the non-zero 
impulses of the considered algebraic codeword to the 
said numerator and denominator and in which the so 
calculated contributions are added to previously 
calculated sum values of said numerator and 
denominator, respectively. 

13. The selecting method of claim 12, 
wherein the embedded loops comprise an inner loop, and 
wherein the said codeword selecting step comprises the 
steps of: 

processing in the inner loop the said 
calculated target ratios to determine an optimized 
target ratio; and 

selecting the said one particular 
algebraic codeword in function of said optimized 
target ratio. 
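
The embedded loops of claims 12 and 13 can be illustrated with a sketch in which each loop level adds the contribution of one pulse to running sums of the numerator and denominator, and the innermost loop keeps the best ratio. The sketch assumes exactly three pulses with fixed amplitudes, one per track, and precomputes U = H^T H so that the denominator becomes A U A^T; the track layout, the amplitudes, the matrix layout of H and all function names are hypothetical choices, not details given in the claims.

```python
def backward_filter(x, h):
    """D = X H with H[n][m] = h[n - m] for n >= m (assumed layout)."""
    n = len(x)
    return [sum(x[j] * h[j - i] for j in range(i, n)) for i in range(n)]


def correlation_matrix(h, n):
    """U = H^T H, so that ||A H^T||^2 = A U A^T."""
    return [[sum(h[k - i] * h[k - j] for k in range(max(i, j), n))
             for j in range(n)] for i in range(n)]


def nested_search(d, u, tracks, amps):
    """Three embedded loops; partial sums are reused by the inner levels."""
    t0, t1, t2 = tracks
    a0, a1, a2 = amps
    best_ratio, best_pos = -1.0, None
    for p0 in t0:
        c0 = a0 * d[p0]                      # numerator sum, first pulse
        e0 = a0 * a0 * u[p0][p0]             # denominator sum, first pulse
        for p1 in t1:
            c1 = c0 + a1 * d[p1]
            e1 = e0 + a1 * a1 * u[p1][p1] + 2.0 * a0 * a1 * u[p0][p1]
            for p2 in t2:
                c2 = c1 + a2 * d[p2]
                e2 = (e1 + a2 * a2 * u[p2][p2]
                      + 2.0 * a2 * (a0 * u[p0][p2] + a1 * u[p1][p2]))
                ratio = c2 * c2 / e2 if e2 > 0.0 else 0.0
                if ratio > best_ratio:       # inner loop keeps the optimum
                    best_ratio, best_pos = ratio, (p0, p1, p2)
    return best_ratio, best_pos


n = 9
h = [1.0, 0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
x = [0.2, -1.0, 0.5, 0.9, -0.3, 0.4, 1.1, -0.7, 0.1]
tracks = ([0, 3, 6], [1, 4, 7], [2, 5, 8])
amps = (1.0, -1.0, 1.0)
d = backward_filter(x, h)
u = correlation_matrix(h, n)
best_ratio, best_pos = nested_search(d, u, tracks, amps)
```

No candidate codeword is ever filtered inside the loops: each level only reads entries of D and U, which is where the sparsity of the code pays off.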



14. The selecting method of claim 9, 
wherein the said codeword selecting step comprises the 
steps of: 

processing the said calculated target 
ratios to determine an optimized target ratio; and 

selecting the said one particular 
algebraic codeword in function of said optimized 
target ratio. 



15. An encoder for selecting one 
particular algebraic codeword that can be processed to 
produce a signal excitation for a synthesis means 
capable of synthesizing a sound signal, comprising: 

means for whitening said sound signal 
to be synthesized and thereby generating a residual 
signal; 

means for computing a target signal 
X by processing a difference between the said residual 
signal and a long term prediction component of said 
signal excitation; 

means for backward filtering the 
target signal to calculate a value D of the said 
target signal in the domain of an algebraic code; 

means for calculating, for each 
codeword among a plurality of available algebraic 
codewords Ak expressed in the said algebraic code, a 
target ratio which is function of the value D, the 
codeword Ak, and a transfer function H = D / X; and 

means for selecting the said one 
particular codeword among said plurality of available 
algebraic codewords in function of the calculated 
target ratios. 

16. The encoder of claim 15, in which said 
target ratio comprises a numerator given by the 
expression $P^2(k) = (D A_k^T)^2$ and a denominator given by 
the expression $\alpha_k^2 = \| A_k H^T \|^2$, where Ak and H are 
in matrix form. 



17. The encoder of claim 16, wherein each 
codeword Ak is a waveform comprising a small number of 
non-zero impulses each of which can occupy different 
positions in the waveform to thereby enable 
composition of different codewords. 

18. The encoder of claim 17, in which said 
target ratio calculating means comprises means for 
calculating into a plurality of embedded loops 
contributions of the non-zero impulses of the 
considered algebraic codeword to the said numerator 
and denominator and for adding the so calculated 
contributions to previously calculated sum values of 
said numerator and denominator, respectively. 

19. The encoder of claim 18, wherein the 
embedded loops comprise an inner loop, and wherein the 
said codeword selecting means comprises: 

means for processing in the inner loop 
the said calculated target ratios to determine an 
optimized target ratio; and 

means for selecting the said one 
particular algebraic codeword in function of said 
optimized target ratio. 

20. The encoder of claim 15, wherein the 
said codeword selecting means comprises: 

means for processing the said 
calculated target ratios to determine an optimized 
target ratio; and 

means for selecting the said one 
particular algebraic codeword in function of said 
optimized target ratio. 



21. A method of generating at least one 
long term prediction parameter related to a sound 
signal in view of encoding the said sound signal, 
comprising the steps of: 

whitening said sound signal to 
generate a residual signal; 

producing a long term prediction 
component of a signal excitation for a synthesis means 
capable of synthesizing the said sound signal, said 
producing step including estimating an unknown portion 
of the long term prediction component with the said 
residual signal; and 

calculating the said at least one long 
term prediction parameter in function of the so 
produced long term prediction component of said signal 
excitation. 



22. A device for generating at least one 
long term prediction parameter related to a sound 
signal in view of encoding the said sound signal, 
comprising: 

means for whitening said sound signal 
and thereby generating a residual signal; 

means for producing a long term 
prediction component of a signal excitation for a 
synthesis means capable of synthesizing the said sound 
signal, said producing means including means for 
estimating an unknown portion of the long term 
prediction component with the said residual signal; 
and 

means for calculating the said at 
least one long term prediction parameter in function 
of the so produced long term prediction component of 
said signal excitation. 
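
The idea of claims 21 and 22, estimating the not-yet-known part of the long term prediction component with the residual, can be sketched as a pitch search over candidate delays T. Several points are assumptions made only for illustration: the perceptual filter is replaced by a short FIR response `h`, the score for each delay is taken as p = (Y . Z) / ||Z||, and the gain as b = p / ||Z||, which is the least-squares gain for the chosen delay. The function and variable names are hypothetical.

```python
import math


def perceptual_filter(signal, h):
    """Zero-state FIR filtering (stand-in for the perceptual filter)."""
    out = []
    for i in range(len(signal)):
        first = max(0, i - len(h) + 1)
        out.append(sum(signal[m] * h[i - m] for m in range(first, i + 1)))
    return out


def ltp_search(past_exc, residual, y, h, t_min, t_max):
    """Return (delay, gain); requires t_max <= len(past_exc)."""
    block = len(residual)
    # Known past excitation, extended into the current block with the
    # residual as an estimate of the still unknown excitation samples.
    ext = list(past_exc) + list(residual)
    start = len(past_exc)
    best_p, best_t, best_norm = 0.0, None, 0.0
    for t in range(t_min, t_max + 1):
        shifted = [ext[start + n - t] for n in range(block)]
        z = perceptual_filter(shifted, h)
        norm = math.sqrt(sum(v * v for v in z))
        if norm == 0.0:
            continue
        p = sum(yi * zi for yi, zi in zip(y, z)) / norm
        if p > best_p:
            best_p, best_t, best_norm = p, t, norm
    gain = best_p / best_norm if best_t is not None else 0.0
    return best_t, gain


pattern = [1.0, 0.0, -0.5, 0.25, 0.0]       # period of 5 samples
past_exc = pattern * 4
residual = pattern * 2
h = [1.0, 0.5]
y = perceptual_filter(residual, h)           # target for this toy case
delay, gain = ltp_search(past_exc, residual, y, h, 3, 12)
```

Without the extension step, delays shorter than the block length would need excitation samples that have not been computed yet; substituting the residual lets those short delays be scored in the same loop.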

[Flow chart, steps 301 to 310]

START
301  INPUT NEXT BLOCK OF SPEECH SAMPLES
302  COMPUTE & QUANTIZE SPECTRAL PARAMETERS: STP
303  COMPUTE RESIDUAL: R
304  COMPUTE AND QUANTIZE PITCH PARAMETERS: LTP
305  COMPUTE FILTER RESPONSES CHARACTERIZATION: FRC
306  COMPUTE LONG TERM PREDICTION PART OF THE EXCITATION: E
307  COMPUTE SIGNAL TARGET: X
308  BACKWARD FILTER TO OBTAIN TARGET IN A-CODE SPACE: D
309  FIND BEST CODE & GAIN AND GENERATE SCALED CODE
310  COMPUTE INITIAL FILTER STATE: FS
STOP

[Flow chart, steps 401 to 412; the decision boxes and the assignment of reference numerals to individual boxes are not recoverable]

START
SET INITIAL STATE OF PERCEPTUAL FILTER 107 TO FS, WHICH IS PROVIDED BY MODULE 110
COMPUTE TARGET SIGNAL Y BY FILTERING R THROUGH PERCEPTUAL FILTER 107
EXTEND BEYOND KNOWN EXCITATION USING RESIDUAL: FOR n = 1 TO L-Tmin+1 DO E(n) = R(n)
SET MAX = 0; SET T = Tmin
WITH ZERO INITIAL STATE PROCESS SHIFTED EXCITATION E(n-T) THROUGH PERCEPTUAL FILTER 107 AND OBTAIN Z
COMPUTE p SUCH THAT [expression not legible]
SET THE LTP PERIOD = T; SET MAX = p; SET α_p = ||Z||
T = T + 1
COMPUTE & QUANTIZE LTP GAIN, b, WHERE b = MAX/α_p
STOP